Method, system, and program for managing input/output (I/O) performance between host systems and storage volumes

ABSTRACT

Provided are a method, system, and program for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems. An application service connection definition is provided for each connection from a host to a storage volume. At least one service level guarantee definition is provided indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources. Each service level guarantee definition is associated with at least one application service connection definition. Monitoring is performed as to whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention is related to a method, system, and program for managing I/O performance between host systems and storage volumes.

2. Description of the Related Art

A storage service provider may maintain a large network, such as a Fibre Channel Storage Area Network (SAN), to service the computing needs for one or more customers. The SAN includes numerous host systems including the customer applications linked via a Fibre Channel fabric to one or more storage systems, such as one or more interconnected disk drives configured as a Redundant Array of Independent Disks (RAID), Just a Bunch of Disks (JBOD), Direct Access Storage Device (DASD), etc. Typically, a customer will pursue a service level agreement (SLA) with the storage service provider concerning the criteria under which network storage resources are provided, such as the storage capacity, network throughput, I/O response time, I/O operations per second, and other performance criteria under which the network resources will be provided. In certain situations, multiple customers with different levels of requirements specified in their service level agreements will share the same network resources. This requires that the storage service provider monitor and manage the network resources to ensure that the different customer requirements specified in the different service level agreements are satisfied.

Accordingly, there is a need in the art to provide a method to specify the service level agreements and performance requirements to ensure that customers of these storage resources receive service according to the agreed upon performance criteria for providing the network resources.

SUMMARY OF THE PREFERRED EMBODIMENTS

Provided are a method, system, and program for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems. An application service connection definition is provided for each connection from a host to a storage volume. At least one service level guarantee definition is provided indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources. Each service level guarantee definition is associated with at least one application service connection definition. Monitoring is performed as to whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.

In further implementations, multiple service level guarantee definitions indicating different performance criteria are associated with different sets of application service connection definitions.

In still further implementations, an application service group is provided identifying a plurality of application service connection definitions, wherein associating the service level guarantee definition with the application service connection definitions comprises associating each service level guarantee definition with at least one application service group, wherein the application service connection definitions identified in the application service group are associated with the service level guarantee definitions with which their application service group is associated.

In additional implementations, monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition comprises: gathering performance information concerning I/O requests for each connection; selecting one service level guarantee definition; and for each connection identified by one application service connection definition associated with the selected service level guarantee definition, comparing the gathered performance information for the connection with the performance criteria indicated in the selected service level guarantee definition.

Additionally, the operations among the I/O paths represented by the application service connection definitions associated with the selected service level guarantee definition may be adjusted if the gathered performance information for the I/O paths does not satisfy the performance criteria.

BRIEF DESCRIPTION OF THE DRAWINGS

Referring now to the drawings in which like reference numbers represent corresponding parts throughout:

FIGS. 1 and 2 illustrate network computing environments in which embodiments of the invention are implemented;

FIGS. 3, 4, 5, 6, 7, and 8 illustrate an arrangement of information on I/O paths between hosts and storage volumes and service level requirements providing performance criteria for service level agreements and associations therebetween in accordance with implementations of the invention;

FIGS. 9, 10, 11, and 12 illustrate operations performed to utilize the information described with respect to FIGS. 3-8 to manage the network shown in FIGS. 1 and 2; and

FIG. 13 illustrates a computing architecture that may be used to implement the network components described with respect to FIGS. 1 and 2.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the following description, reference is made to the accompanying drawings which form a part hereof and which illustrate several embodiments of the present invention. It is understood that other embodiments may be utilized and structural and operational changes may be made without departing from the scope of the present invention.

FIG. 1 illustrates a network computing environment in which embodiments of the invention are implemented. A plurality of host systems 2 a, 2 b, 2 c including one or more application programs 4 a, 4 b, 4 c are in communication with a plurality of storage systems 6 a, 6 b, 6 c over a network 8, such as a Fibre Channel Storage Area Network (SAN). The host systems 2 a, 2 b, 2 c may comprise any computing device known in the art, such as a server class machine, workstation, desktop computer, etc. The storage systems 6 a, 6 b, 6 c may comprise any mass storage device known in the art, such as one or more interconnected disk drives configured as a Redundant Array of Independent Disks (RAID), Just a Bunch of Disks (JBOD), or Direct Access Storage Device (DASD), or a tape storage device, e.g., a tape library, etc.

A virtualization controller 10 is a system that is connected to the SAN 8 and implements a virtualization layer 12 for the SAN 8 to present the storage space available in the storage systems 6 a, 6 b, 6 c as one or more common virtual storage pools. The virtualization layer 12 maps the physical storage resources available in the storage systems 6 a, 6 b, 6 c to virtual volumes in the virtualization layer 12. For instance, physical storage in different storage systems 6 a, 6 b, 6 c can be organized in the virtualization layer 12 as a single virtual volume. The virtualization controller 10 further implements multiple performance gateways 14 a, 14 b.

Each Input/Output path between a host application 4 a, 4 b, 4 c and a storage system 6 a, 6 b, 6 c is assigned to a particular performance gateway 14 a, 14 b. The performance gateways 14 a, 14 b intercept the I/O command for the assigned path (e.g., host and application) and record performance data for the I/O command, such as access time, time to complete, I/O throughput, etc. Thus, any I/O commands and data transferred between the applications 4 a, 4 b, 4 c and storage systems 6 a, 6 b, 6 c, represented as common storage pools in the virtualization layer 12, are handled by the performance gateway 14 a, 14 b to which the path on which the I/O commands and/or data are transmitted is assigned. The performance gateway 14 a, 14 b sends any gathered performance data to a service level agreement (SLA) server 16. The SLA server 16 includes an SLA database 20 including information on I/O paths and the criteria for different service level agreements. The SLA server 16 processes information in the SLA database 20 to determine how to process the performance information received from the performance gateways 14 a, 14 b. The SLA server 16 includes a performance analyzer 22 to analyze the performance statistics received from the performance gateways 14 a, 14 b. The performance analyzer 22 may generate reports on the results of measuring I/O performance with respect to I/O paths among the hosts 2 a, 2 b, 2 c and the storage systems 6 a, 6 b, 6 c. The throttling policies 24 include information that indicates how the SLA server 16 is to adjust I/O activity to optimize performance based on the performance information gathered at the performance gateways 14 a, 14 b. A service level agreement (SLA) client 28 communicates with the SLA server 16 using a protocol, such as the Hypertext Transfer Protocol (HTTP). A user or administrator may use the SLA client 28 to interface with the SLA server 16 to provide input on service criteria and access performance reports and statistics generated by the performance analyzer 22.

In certain implementations, the virtualization controller 10 may transmit the gathered performance data to the SLA server 16 over another network, such as a Local Area Network 26.

FIG. 2 provides a further illustration of the connection between the hosts 2 a, 2 b, 2 c and the storage systems 6 a, 6 b, 6 c. The paths from the host applications 4 a, 4 b, 4 c connect to one of the performance gateways 14 a, 14 b, implemented in the virtualization controller 10, and from the performance gateways 14 a, 14 b to one virtual volume 30 a, 30 b, 30 c, 30 d implemented in the virtualization layer 12. Different host applications in one host system may have different paths that are assigned to different or the same performance gateways. The storage in each virtual volume 30 a, 30 b, 30 c, 30 d maps to one or more of the physical storage systems 6 a, 6 b, 6 c. The connection 32 from the performance gateways 14 a, 14 b to the SLA server 16 (FIG. 1), which may comprise a LAN connection 26 (e.g., a TCP/IP connection), is also illustrated.

The SLA database 20 includes information on service level agreements between customers and the storage service provider that maintains the SAN network and storage resources. An administrator may use the SLA client 28 to input the information on the service level agreements. An application storage connection (ASC) record would be maintained for each connection between a host system 2 a, 2 b, 2 c and storage volume 30 a, 30 b, 30 c, 30 d in the SAN 8 that is established pursuant to a service level agreement. FIG. 3 illustrates the fields maintained in an application storage connection record 50 as including:

-   ASC Name 52: a unique name for an application storage connection (ASC) between one host and a storage volume, either virtual or physical.
-   Host 54: a unique identifier of a host used to represent the host in the SAN 8, such as a world wide name (WWN) recognized in the Fibre Channel network.
-   Performance Gateway 56: a port identifier in the performance gateway 14 a, 14 b monitoring a connection between the application host 2 a, 2 b, 2 c and a storage system 6 a, 6 b, 6 c.
-   Logical Volume 58: the name of a logical volume 30 a, 30 b, 30 c, 30 d or physical storage system 6 a, 6 b, 6 c which the host 54 can access through the connection represented by the ASC record.

The storage service provider of the SAN 8 may further define, using the SLA client 28, one or more application storage groups (ASGs), each identifying one or more application storage connections, where a single application storage connection, represented by one ASC record 50, may be included in multiple application storage groups. FIG. 4 illustrates an application storage group (ASG) record 70 as including a unique ASG name 72 and the unique identifier of one or more application storage connections (ASCs) 74 assigned to the group. In determining which application storage connections to assign to an application storage group, the administrator may assign ASCs that all must satisfy minimum performance criteria defined within the one or more service level agreements that specify performance criteria for the application storage connections within the group.
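By way of illustration only, the ASC record 50 and ASG record 70 may be sketched as simple data structures. The Python class names, attribute names, and sample values below are hypothetical and are not taken from the figures; only the fields themselves follow FIGS. 3 and 4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AscRecord:
    """Application storage connection (ASC) record; fields follow FIG. 3."""
    asc_name: str             # ASC Name 52
    host: str                 # Host 54, e.g., a world wide name (WWN)
    performance_gateway: str  # Performance Gateway 56 port identifier
    logical_volume: str       # Logical Volume 58


@dataclass
class AsgRecord:
    """Application storage group (ASG) record; fields follow FIG. 4."""
    asg_name: str                                        # ASG name 72
    asc_names: List[str] = field(default_factory=list)   # ASCs 74


def groups_containing(asc_name, groups):
    """Return the names of every ASG that lists the given ASC."""
    return [g.asg_name for g in groups if asc_name in g.asc_names]


# Hypothetical sample data: one ASC that belongs to two groups.
asc = AscRecord("asc-1", "10:00:00:00:c9:2b:5a:01", "gw-a/port-0", "vol-30a")
groups = [
    AsgRecord("asg-gold", ["asc-1", "asc-2"]),
    AsgRecord("asg-batch", ["asc-1"]),
]
print(groups_containing(asc.asc_name, groups))
```

The sketch reflects that a single ASC may be included in multiple application storage groups, so membership is looked up across all groups rather than stored on the ASC itself.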

The storage service provider may further define, using the SLA client 28, a plurality of service level guarantees that define performance guarantees defining the level of performance the storage service provider must provide pursuant to one or more service level agreements. A defined service level guarantee may apply to one or more application storage groups to define the level of service and performance expected of the connections identified by the application service connections (ASCs) included in the application service groups (ASGs) to which the service level guarantee is assigned. FIG. 5 illustrates information included in a service level guarantee record 90, including:

-   Service Level Guarantee (SLG) Name 92: a unique name of the service level guarantee.
-   Service Level Guarantee Class 94: the network storage service provider may define classes that describe the nature of the performance specified for the service level guarantee, such as standard for general applications; premium transaction for transactional applications that require high I/O operations per second; and premium throughput for applications that require high throughput, e.g., megabytes per second.
-   Priority 96: a numeric representation of the priority of the SLG, from high to low.
-   Evaluation Interval 98: the time interval during which performance is evaluated. This value may indicate a number of units and a choice of time, such as five evaluations per hour, etc.
-   Percentage Guarantee 100: a percentage of the I/O operations that shall meet the defined performance requirements during the evaluation period.
-   Mean Response Time (MRT) 102: a mean response time for each I/O operation in milliseconds (ms).
-   Normalized Delivered I/O (NDI) 104: a normalized number of I/O operations per second per contracted storage unit, e.g., 100 gigabytes of contracted capacity (IOPS/100 GB).
-   Normalized Delivered Throughput (NDT) 106: a normalized number of megabytes per second per contracted storage unit, e.g., 100 gigabytes of contracted capacity.

In certain implementations, different performance characteristics, e.g., MRT, NDI, and NDT, may be specified for each performance class, e.g., standard, premium transactions, premium throughput. The NDI 104 and NDT 106 are demand metrics in that their value depends on the demand of the customer workload, whereas the MRT is a delivery metric and is a measure of the quality of the service regardless of the workload. The SLA server 16 would compare the actual measured performance metrics with the demand and response time criteria specified for the service level guarantee. For instance, if demand is less than agreed upon limits, then the response time is guaranteed to be less than the agreed upon MRT and the service level agreement performance criteria are met. However, when demand exceeds agreed upon limits, i.e., the actual throughput or I/Os per second exceeds the agreed upon limits, then the I/O access is exempt from the mean response time (MRT) requirement.
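As a minimal sketch of the relationship between the demand metrics and the delivery metric described above, the following Python fragment models an SLG record 90 and decides whether the MRT requirement applies to a set of measurements. The class, attribute, and function names and the sample values are hypothetical. Treating demand exactly equal to a limit as within the limit is an assumption, since the description does not address the boundary case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlgRecord:
    """Service level guarantee (SLG) record; fields follow FIG. 5."""
    slg_name: str                 # SLG Name 92
    slg_class: str                # Class 94, e.g., "premium transaction"
    priority: int                 # Priority 96
    evaluation_interval_s: float  # Evaluation Interval 98 (seconds, assumed unit)
    percentage_guarantee: float   # Percentage Guarantee 100
    mrt_ms: float                 # Mean Response Time 102 (milliseconds)
    ndi_iops_per_100gb: float     # Normalized Delivered I/O 104
    ndt_mbps_per_100gb: float     # Normalized Delivered Throughput 106


def mrt_requirement_applies(slg, measured_iops_per_100gb, measured_mbps_per_100gb):
    """Return True when demand is within the agreed limits.

    If either demand metric exceeds its limit, the I/O access is exempt
    from the MRT requirement, so the function returns False.
    Assumption: demand equal to a limit counts as within the limit.
    """
    return (measured_iops_per_100gb <= slg.ndi_iops_per_100gb
            and measured_mbps_per_100gb <= slg.ndt_mbps_per_100gb)


# Hypothetical SLG and measurements.
slg = SlgRecord("gold", "premium transaction", 1, 720.0, 95.0, 10.0, 100.0, 10.0)
print(mrt_requirement_applies(slg, 80.0, 6.0))    # demand within limits
print(mrt_requirement_applies(slg, 140.0, 6.0))   # IOPS demand exceeds NDI
```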

Another relational layer is the service level commitment (SLC) defined by the storage service provider which is used to apply one defined service level guarantee (SLG) to one or more application service groups (ASG) for a particular customer having multiple hosts connected to the SAN 8. FIG. 6 illustrates a service level commitment (SLC) record 120 as including:

-   SLC Name 122: a unique name of the service level commitment.
-   Customer 124: name of the customer to which the assigned service level guarantee applies.
-   SLG Name 126: the name of a defined service level guarantee (SLG) 90 providing commitments for the named customer.
-   ASG Names 128: the name of one or more application service groups including application service connections for the customer to which the SLG will apply.
-   Retention Interval 130: the time period during which the system will gather statistic data of ASGs, defined by value and unit.
-   Reporting Interval 132: the interval at which the system will send collected statistics on performance data of the ASGs, defined by value and unit, to the performance analyzer 22 in the SLA server 16. The collection and reporting intervals 130 and 132 may be adjusted.

The storage service provider may review a service level agreement for a customer and then assign, using the SLA client 28, service level commitments to application service groups of application service connections for a customer by defining service level commitments for that customer. The storage service provider may enter information on connections (ASCs), groups of connections (ASGs), performance criteria (SLGs), and the relation therebetween (SLCs) at the SLA client 28, where the defined ASCs, ASGs, SLGs, and SLCs are stored in the SLA database 20. Each instance of the above records (e.g., ASC, ASG, SLG, and SLC records) may be implemented in an Extensible Markup Language (XML) file or records within a database.

FIG. 7 illustrates a relationship among the different above defined records. A plurality of application service connections (ASCs) 150 a, 150 b . . . 150 l, each defining a connection from a host to one logical or physical storage volume 152 a, 152 b, 152 c, and 152 d, are grouped within application service groups 154 a, 154 b, and 154 c, as shown by the connecting lines. Different service level guarantees 156 a, 156 b, one heavy and one light, respectively, are associated with different ones of the application service groups 154 a, 154 b, 154 c through service level commitment 158 a, 158 b associations.

When the service level agreement (SLA) server 16 determines that certain service level guarantees are not being satisfied for an application service group (ASG), then the SLA server 16 may apply a predefined throttling policy 24 to alter and affect the performance. This throttling policy may cause the performance gateways 14 a, 14 b that manage the I/O paths to delay the I/O transmitted through I/O paths that are over performing with respect to their associated service level guarantees to improve the performance of I/O paths that are underperforming.

FIG. 8 illustrates how the service level agreement server (SLA) 16 gathers performance data by application service group 170. The SLA server 16 will gather performance data for each application service connection (ASC) 172 a, 172 b . . . 172 n (FIG. 8) defined for the application service group (ASG) 170, including the measured response times 174 a, 174 b . . . 174 n for I/O operations, the number of I/O operations per second per 100 gigabytes of contracted capacity 176 a, 176 b . . . 176 n (IOPS/100 GB), and the number of megabytes per second per 100 gigabytes of contracted capacity 178 a, 178 b . . . 178 n (MBPS/100 GB). Such information may be maintained for each defined ASG and the ASCs defined therein.

FIG. 9 illustrates operations performed by the SLA server 16 when receiving (at block 200) performance data gathered by one performance gateway 14 b. The SLA server 16 locates (at block 202) the performance data for the ASC 172 a, 172 b . . . 172 n (FIG. 8) for which the performance data was received. The located performance data for the ASC 172 a, 172 b . . . 172 n is updated with the newly received performance data. In this way, the performance data may maintain each instance of a measured performance parameter, such as response time for each I/O request, I/O operations per second, and I/O throughput.
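The locate-and-update operations of FIG. 9 may be sketched as follows. This is an illustrative Python sketch only; the class name, method names, dictionary keys, and sample values are hypothetical, and an in-memory dictionary stands in for whatever storage the SLA database 20 actually uses.

```python
from collections import defaultdict


class AscPerformanceStore:
    """Keeps every measured sample per ASC, keyed by ASC name."""

    def __init__(self):
        # One entry per ASC, created on first use.
        self._data = defaultdict(
            lambda: {"response_ms": [], "iops_per_100gb": [], "mbps_per_100gb": []}
        )

    def update(self, asc_name, response_ms=(), iops_per_100gb=(), mbps_per_100gb=()):
        """Locate the ASC's data (block 202) and append the new samples."""
        entry = self._data[asc_name]
        entry["response_ms"].extend(response_ms)
        entry["iops_per_100gb"].extend(iops_per_100gb)
        entry["mbps_per_100gb"].extend(mbps_per_100gb)
        return entry

    def get(self, asc_name):
        """Return the accumulated samples for one ASC."""
        return self._data[asc_name]


# Two hypothetical reports from a performance gateway for the same ASC.
store = AscPerformanceStore()
store.update("asc-1", response_ms=[4.0, 6.5], iops_per_100gb=[80.0])
store.update("asc-1", response_ms=[5.2], mbps_per_100gb=[7.5])
print(store.get("asc-1")["response_ms"])
```

Appending rather than overwriting keeps each instance of a measured parameter, which a later percentage-based evaluation of response times would need.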

FIG. 10 illustrates operations performed by the SLA server 16 to collect reporting data according to time limitations included in the service level guarantees records 90 in the SLA database 20. For each defined SLC record 120 (FIG. 6) (at block 220), the SLA server 16 will gather (at block 222) after the collection interval all the performance data for ASCs 172 a, 172 b . . . 172 n within the ASG 170 identified in the ASG name field 128 (FIG. 6) in the SLC record 120. Further, at the expiration of the reporting interval 132 for each SLC record 120, the SLA server 16 will collect (at block 224) all the performance data for ASCs 172 a, 172 b . . . 172 n, including the measured response times 174 a, 174 b . . . 174 n, measured IOPS/100 GB 176 a, 176 b . . . 176 n, and measured MBPS/100 GB 178 a, 178 b . . . 178 n, within the ASG 170 identified in the ASG name field 128 of the SLC record 120 being processed, and send such gathered statistics and performance data to the SLA client 28. The SLA server 16 may run a timer, continually reset upon expiration, for each collection interval 130 and reporting interval 132 identified in each SLC record 120 to perform steps 222 and 224 when the timer runs to zero. The timers would be reset and run again.

FIGS. 11 and 12 illustrate logic implemented in the SLA server 16 to determine whether application service connections (ASCs) 150 a, 150 b . . . 150 l (FIG. 7) satisfy service level guarantee definitions 156 a, 156 b with which they are associated. Upon the expiration (at block 250) of the evaluation interval 98 for one SLG record 90 (FIG. 5), the SLA server 16 would begin operations to determine whether the performance criteria for the SLG 156 a, 156 b are being satisfied with respect to the application service connections (ASCs) 150 a, 150 b . . . 150 l to which the service level guarantees apply. The SLA server 16 may initialize a timer for each evaluation interval 98 in each SLG record 90 to determine when to initiate the performance checking procedure. The SLA server 16 then determines (at block 251) the service level commitment (SLC) 158 a, 158 b whose SLG name field 126 (FIG. 6) identifies the SLG 156 a, 156 b whose evaluation timer expired. From the ASG name field 128 in the record 120 for the determined service level commitment (SLC) 158 a, 158 b, the SLA server 16 determines (at block 252) the one or more application service groups (ASG) 154 a, 154 b, 154 c that are associated with the service level guarantees (SLG) 156 a, 156 b to check.
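The lookup at blocks 251 and 252 may be sketched as a traversal from an SLG name through the SLC records to the ASGs and their ASCs. The Python below is illustrative only: the function name, dictionary keys, and sample records are hypothetical, and plain dictionaries stand in for the SLC record 120 and ASG record 70.

```python
def ascs_to_check(slg_name, slc_records, asg_records):
    """Return the ASC names governed by the SLG whose evaluation timer expired.

    slc_records: list of dicts with "slg_name" and "asg_names" keys.
    asg_records: dict mapping an ASG name to its list of ASC names.
    """
    asc_names = []
    for slc in slc_records:
        if slc["slg_name"] != slg_name:          # block 251: match SLG name field
            continue
        for asg_name in slc["asg_names"]:        # block 252: ASGs named in the SLC
            for asc_name in asg_records[asg_name]:
                if asc_name not in asc_names:    # an ASC may sit in several ASGs
                    asc_names.append(asc_name)
    return asc_names


# Hypothetical records loosely mirroring the arrangement of FIG. 7.
slcs = [
    {"slc_name": "slc-1", "customer": "customer-a", "slg_name": "heavy",
     "asg_names": ["asg-a", "asg-b"]},
    {"slc_name": "slc-2", "customer": "customer-a", "slg_name": "light",
     "asg_names": ["asg-c"]},
]
asgs = {
    "asg-a": ["asc-1", "asc-2"],
    "asg-b": ["asc-2", "asc-3"],
    "asg-c": ["asc-4"],
}
print(ascs_to_check("heavy", slcs, asgs))
```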

The SLA server 16 then performs a loop at blocks 254 through 268 for each application service connection (ASC) 150 a, 150 b . . . 150 l in each of the determined one or more application service groups (ASG) 154 a, 154 b, 154 c. At block 256, the SLA server 16 determines the performance requirements for the ASC i from the service level guarantee (SLG) 156 a, 156 b being checked, including the mean response time (MRT) 102, the normalized delivered I/O (NDI) 104, and the normalized delivered throughput (NDT) 106 (FIG. 5). These desired performance criteria may be specified as a range, such as greater and/or less than a value and unit. The SLA server 16 then determines (at block 258) whether the percentage of I/Os for ASC i that satisfies the MRT requirement 102 is greater than or equal to the percentage guarantee 100 for the SLG being checked. As discussed, the ASC performance statistics 172 a, 172 b . . . 172 n would include the measured response times 174 a, 174 b . . . 174 n for a measured time interval. The SLA server 16 may process these measured response times to determine the percentage of such response times that satisfy the MRT requirement 102 to accomplish the step at block 258.

If (at block 258) the measured response times do satisfy the percentage guarantee 100, then the SLA server 16 determines (at block 260) whether the demand level for the connection represented by ASC i is less than the agreed demand level. As discussed, the measured demand is determined from I/O operations per second 176 a, 176 b . . . 176 n and the number of megabytes per second 178 a, 178 b . . . 178 n measured for the ASC i and whether this measured activity falls within the agreed upon demand parameters, e.g., the normalized delivered I/O (NDI) 104 and the normalized delivered throughput (NDT) 106 indicated in the service level guarantee record 90 being checked. If (at block 260) the demand satisfies agreed upon SLG demand parameters, then a determination is made (at block 262) whether the mean response time (MRT) of the measured response times 174 a, 174 b . . . 174 n (FIG. 8) for ASC i is less than or equal to the MRT 102 in the SLG record 90 (FIG. 5) for the service level guarantee being checked. If (from the no branch of block 258) the measured response time does not satisfy the percentage guarantee 100 or if (from the no branch of block 262) the measured mean response time (MRT) does not meet the agreed criteria indicated in the MRT field 102 of the SLG record 90, then an indication is made (at block 264) that ASC i is underperforming. Otherwise, if the performance guarantee is satisfied (from the yes branch of block 258) and the demand does not exceed agreed upon levels (from the yes branch of 260), then indication is made (at block 266) that the ASC i is over performing with respect to the SLG requirements. Control then proceeds (at block 268) back to block 254 to consider any further application service connections (ASCs) 150 a, 150 b . . . 150 l in the ASGs 154 a, 154 b, 154 c to which the SLG 156 a, 156 b applies.
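One possible reading of the checks at blocks 258 through 266 for a single ASC may be sketched as follows. The Python is a sketch under stated assumptions, not the claimed logic: the function name, dictionary keys, status strings, and sample values are hypothetical; demand is compared using single measured values per ASC; and an ASC whose demand exceeds the agreed limits is treated as not underperforming, because the description exempts such I/O access from the MRT requirement without stating the resulting indication.

```python
from statistics import mean


def classify_asc(response_ms, iops_per_100gb, mbps_per_100gb, slg):
    """Classify one ASC as "underperforming" or "overperforming".

    slg is a dict with keys "percentage_guarantee", "mrt_ms", "ndi", "ndt".
    """
    if not response_ms:
        raise ValueError("no measured response times for this ASC")

    # Block 258: percentage of I/Os whose response time satisfies the MRT.
    within = sum(1 for r in response_ms if r <= slg["mrt_ms"])
    percent_ok = 100.0 * within / len(response_ms)
    if percent_ok < slg["percentage_guarantee"]:
        return "underperforming"                      # block 264

    # Block 260: is the measured demand within the agreed NDI and NDT?
    demand_ok = (iops_per_100gb <= slg["ndi"]
                 and mbps_per_100gb <= slg["ndt"])

    # Block 262: the mean response time is checked only when demand is within limits.
    if demand_ok and mean(response_ms) > slg["mrt_ms"]:
        return "underperforming"                      # block 264

    # Assumption: demand above the limits is exempt from the MRT check.
    return "overperforming"                           # block 266


slg = {"percentage_guarantee": 90.0, "mrt_ms": 10.0, "ndi": 100.0, "ndt": 10.0}
samples = [5.0] * 9 + [200.0]          # 90% within MRT, but the mean is 24.5 ms
print(classify_asc([5.0] * 10, 50.0, 5.0, slg))
print(classify_asc(samples, 50.0, 5.0, slg))
print(classify_asc(samples, 500.0, 5.0, slg))
```

The second call shows why both checks exist: a single slow request can leave the percentage guarantee satisfied while pushing the mean response time past the agreed MRT.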

After processing all the ASCs, control proceeds (at block 270) to block 280 in FIG. 12 where the SLA server 16 collects (at block 280) the results of performance information and stores that information in the SLA database 20. The performance analyzer 22 receives the information for ASCs in ASGs, and determines the actions that need to be performed. If (at block 282) no ASCs are under performing, then the SLA server 16 signals (at block 284) the performance gateway(s) 14 a, 14 b (FIGS. 1 and 2), to which the just considered application service connections (ASCs) 150 a, 150 b . . . 150 l are connected, to end throttling for any of the ASCs that are determined not to be under performing. This will end any throttling of ASCs which are now performing at the level specified in the checked service level guarantee (SLG), which meets the criteria specified in the service level agreements associated with the ASCs. If certain of the considered ASCs 150 a, 150 b . . . 150 l are under performing (from the no branch of block 282) and some are over performing (at block 286), then the SLA server 16 signals (at block 288) the performance gateways 14 a, 14 b, to which the just considered ASCs are connected, to throttle over performing ASCs in the ASG(s) associated with the SLG being checked to slow down their performance and direct network storage resources toward the under performing ASCs associated with the SLG. The signaled performance gateways 14 a, 14 b may delay forwarding I/O requests transmitted on the over performing ASCs to improve the performance of the I/O requests transmitted on the underperforming ASCs by giving priority to I/O requests transmitted on the under performing ASCs. The performance gateway 14 a, 14 b may use any throttling technique known in the art to direct resources away from over performing ASCs to the under performing ones.

If (from the no branch of block 286) there are under performing ASCs, but no over performing ASCs to throttle, then the SLA server 16 generates (at block 290) an alert to notify the storage service provider of the underperforming ASCs. The storage service provider may be notified through the admin client 18.
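The three outcomes of FIG. 12 (end throttling, throttle over performing ASCs, or raise an alert) may be sketched as a single decision function. The Python is illustrative only; the function name, the status strings, and the shape of the returned dictionary are hypothetical, and the sketch only selects an action rather than signaling any performance gateway.

```python
def plan_action(status_by_asc):
    """Choose the action for one evaluated SLG from per-ASC indications.

    status_by_asc maps an ASC name to "underperforming" or "overperforming".
    """
    under = sorted(a for a, s in status_by_asc.items() if s == "underperforming")
    over = sorted(a for a, s in status_by_asc.items() if s == "overperforming")

    if not under:
        # Blocks 282 and 284: nothing is underperforming, so end throttling.
        return {"action": "end_throttling", "ascs": sorted(status_by_asc)}
    if over:
        # Blocks 286 and 288: delay I/O on over performing ASCs to favor the rest.
        return {"action": "throttle", "ascs": over, "favor": under}
    # Block 290: nothing left to throttle, so notify the storage service provider.
    return {"action": "alert", "ascs": under}


print(plan_action({"asc-1": "overperforming", "asc-2": "overperforming"}))
print(plan_action({"asc-1": "overperforming", "asc-2": "underperforming"}))
print(plan_action({"asc-1": "underperforming", "asc-2": "underperforming"}))
```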

FIG. 13 illustrates an implementation of the SLA server 16 and SLA client 28 in a web service based architecture. The SLA client 300 may include a web browser 302 that renders an SLA administrative graphical user interface (GUI) 304 through which the network administrator may interact with the SLA server 320. The SLA client 300 further includes a performance monitor 306 component that presents the response time in real-time to the network administrator from the performance information gathered by the SLA server 320. The SLA server 320 includes an HTTP server 322 to enable communication with the client web browser 302 and a Java servlet 324 to transfer information received at the HTTP server 322 to an application server 326 and to transfer information from the application server 326 through the HTTP server 322 to the client web browser 302. For instance, the application server 326 may transfer real-time performance data from the SLA database 328 to the client performance monitor 306.

The application server 326 further collects configuration requests and sends monitoring displays to and from the servlet 324, and passes requests to other components. The SLA database 328 comprises a database manager and database that maintains the various defined ASCs, ASGs, SLGs, SLCs, service level agreements, etc., stores collected performance data, and generates reports. SLA services 330 may include the throttling policies and conduct performance analysis and the policy based throttling control.

In the described embodiments, the SLA server 16 attempts to automatically alter how I/O requests are processed to direct more network storage resources to ASCs that are not satisfying certain service level agreement criteria defined in service level guarantees. In certain implementations, this is accomplished by throttling or delaying the processing of I/O requests transmitted on ASCs that are over performing. In this way, SLA server 16 may automatically adjust the network to rebalance the distribution of network resources away from application service connections that are over performing to under performing ASCs. This allows adjustments to the network to boost under performing I/O paths without having to add additional network storage resources.

In web service based architectures, such as shown in FIG. 13, all the information within the SLA database may be implemented as XML files created by the storage service provider at the SLA client 28. FIG. 14 a illustrates how the service level agreement elements, such as the ASGs and ASCs, are defined in an XML format, where the ASGs and ASCs are tagged elements having as attributes the information. FIG. 14 b illustrates how the service level commitment information is implemented in the XML format, wherein an SLC element has various attributes that define the information for the service level commitment (SLC). There may be multiple SLC elements in one XML document or dispersed across multiple XML documents. Similarly, FIG. 14 c illustrates how the service level guarantee (SLG) information is implemented in the XML format with an SLG element including attributes defining the values for the specific SLG element. There may be multiple SLG elements in one XML document or dispersed across multiple XML documents.
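As a rough illustration of SLG elements carrying their values as attributes, the following Python sketch parses a small XML document with the standard library. The element names, attribute names, and values are hypothetical and are not taken from FIG. 14 c, which is not reproduced here; only the general shape (one tagged element per SLG, values held in attributes, multiple elements per document) follows the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; tag and attribute names are assumptions for illustration.
SLG_XML = """
<ServiceLevelGuarantees>
  <SLG name="gold" class="premium transaction" priority="1"
       percentageGuarantee="95" mrtMs="10" ndi="100" ndt="10"/>
  <SLG name="bronze" class="standard" priority="3"
       percentageGuarantee="80" mrtMs="30" ndi="40" ndt="4"/>
</ServiceLevelGuarantees>
"""


def load_slgs(xml_text):
    """Parse SLG elements into a dict keyed by SLG name."""
    root = ET.fromstring(xml_text.strip())
    slgs = {}
    for elem in root.iter("SLG"):
        slgs[elem.get("name")] = {
            "slg_class": elem.get("class"),
            "priority": int(elem.get("priority")),
            "percentage_guarantee": float(elem.get("percentageGuarantee")),
            "mrt_ms": float(elem.get("mrtMs")),
            "ndi": float(elem.get("ndi")),
            "ndt": float(elem.get("ndt")),
        }
    return slgs


slgs = load_slgs(SLG_XML)
print(sorted(slgs))
print(slgs["gold"]["mrt_ms"])
```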

Additional Implementation Details

The network management described herein may be implemented as a method, apparatus or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof. The term “article of manufacture” as used herein refers to code or logic implemented in hardware logic (e.g., an integrated circuit chip, Programmable Gate Array (PGA), Application Specific Integrated Circuit (ASIC), etc.) or a computer readable medium, such as magnetic storage medium (e.g., hard disk drives, floppy disks, tape, etc.), optical storage (CD-ROMs, optical disks, etc.), volatile and non-volatile memory devices (e.g., EEPROMs, ROMs, PROMs, RAMs, DRAMs, SRAMs, firmware, programmable logic, etc.). Code in the computer readable medium is accessed and executed by a processor. The code in which preferred embodiments are implemented may further be accessible through a transmission media or from a file server over a network. In such cases, the article of manufacture in which the code is implemented may comprise a transmission media, such as a network transmission line, wireless transmission media, signals propagating through space, radio waves, infrared signals, etc. Thus, the “article of manufacture” may comprise the medium in which the code is embodied. Additionally, the “article of manufacture” may comprise a combination of hardware and software components in which the code is embodied, processed, and executed. Of course, those skilled in the art will recognize that many modifications may be made to this configuration without departing from the scope of the present invention, and that the article of manufacture may comprise any information bearing medium known in the art.

The described embodiments define a particular arrangement of information on I/O paths between hosts and storage (application service connections), an arrangement of I/O paths (application service groups), service level criteria (service level agreements), and a service level commitment that associates service level criteria with particular I/O paths. In alternative implementations, this relationship of service level agreement criteria to actual host I/O paths may be represented in alternative relationships and data structures than described herein.
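One possible in-memory representation of this arrangement is sketched below in Python. The class names, field names, and the helper guarantees_by_connection are hypothetical and are offered only to show how a commitment could link service level criteria to connections through groups; they do not reproduce any data structure of the described embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplicationServiceConnection:
    """One I/O path from a host to a storage volume (hypothetical fields)."""
    host: str
    volume: str


@dataclass
class ApplicationServiceGroup:
    """A named arrangement of I/O paths."""
    name: str
    connections: list


@dataclass(frozen=True)
class ServiceLevelGuarantee:
    """Performance criteria; both fields are assumed for illustration."""
    name: str
    max_response_time_ms: float
    percent_guarantee: float


@dataclass
class ServiceLevelCommitment:
    """Associates one set of criteria with one or more groups."""
    guarantee: ServiceLevelGuarantee
    groups: list


def guarantees_by_connection(commitments):
    """Map each connection to every guarantee that applies through its groups."""
    result = {}
    for slc in commitments:
        for asg in slc.groups:
            for asc in asg.connections:
                result.setdefault(asc, []).append(slc.guarantee)
    return result
```

In this sketch a connection that belongs to groups covered by several commitments is mapped to all of the corresponding guarantees, so each can be checked in turn.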

In the described implementations, the I/O paths between host and storage are handled by performance gateways implemented in a virtualization controller. In alternative implementations, the monitoring of I/O requests and I/O paths may occur at either the host or storage level, thereby avoiding the need for the use of a separate virtualization controller and virtualization layer.
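Wherever the monitoring is performed, the comparison of gathered performance information against service level criteria can be pictured with the following minimal Python sketch. It assumes, for illustration only, that the criteria are a response time limit and a percentage guarantee, that a path is under performing when the percentage of its sampled response times within the limit falls below the guarantee, and that any path meeting the guarantee while another path misses it is treated as over performing and is delayed. The function names, the fixed delay value, and that over performing rule are assumptions, not the logic of FIGS. 11 and 12.

```python
def percent_within(response_times_ms, limit_ms):
    """Percentage of sampled response times at or below the limit."""
    if not response_times_ms:
        return 100.0  # assumption: no samples means nothing missed the limit
    ok = sum(1 for t in response_times_ms if t <= limit_ms)
    return 100.0 * ok / len(response_times_ms)


def classify(samples_by_path, limit_ms, percent_guarantee):
    """Split paths into (under performing, meeting the guarantee)."""
    under, meeting = [], []
    for path, samples in samples_by_path.items():
        if percent_within(samples, limit_ms) < percent_guarantee:
            under.append(path)
        else:
            meeting.append(path)
    return under, meeting


def throttle_delays(samples_by_path, limit_ms, percent_guarantee, delay_ms=1.0):
    """Return a per-path delay to apply; empty when every path meets the guarantee."""
    under, meeting = classify(samples_by_path, limit_ms, percent_guarantee)
    if not under:
        return {}
    # Assumed policy: delay I/O on the paths that have headroom.
    return {path: delay_ms for path in meeting}
```

A gateway, host, or storage system applying such a policy would add the returned delay before processing each I/O request on the listed paths and would re-evaluate the samples on the next monitoring interval.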

The storage volume associated with an application service connection may comprise a virtual volume managed in a virtualization layer or a physical volume on a storage device.

FIGS. 11 and 12 describe specific operations occurring in a particular order. In alternative implementations, certain operations may be performed in a different order, modified or removed. Moreover, steps may be added to the above described logic and still conform to the described implementations. Further, operations described herein may occur sequentially or certain operations may be processed in parallel. Yet further, operations may be performed by a single processing unit or by distributed processing units.

FIG. 15 illustrates one implementation of a computer architecture 400 of the network components shown in FIGS. 1, 2, and 7, such as the host, storage systems, virtualization controller, etc. The architecture 400 may include a processor 402 (e.g., a microprocessor), a memory 404 (e.g., a volatile memory device), and storage 406 (e.g., a non-volatile storage, such as magnetic disk drives, optical disk drives, a tape drive, etc.). The storage 406 may comprise an internal storage device or an attached or network accessible storage. Programs in the storage 406 are loaded into the memory 404 and executed by the processor 402 in a manner known in the art. The architecture further includes a network card 408 to enable communication with a network. An input device 410 is used to provide user input to the processor 402, and may include a keyboard, mouse, pen-stylus, microphone, touch sensitive display screen, or any other activation or input mechanism known in the art. An output device 412 is capable of rendering information transmitted from the processor 402, or other component, such as a display monitor, printer, storage, etc.

The foregoing description of the implementations has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto. The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many implementations of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.

1. A method for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems, comprising: providing an application service connection definition for each connection from a host to a storage volume; providing at least one service level guarantee definition indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources; associating each service level guarantee definition with at least one application service connection definition; and monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.
2. The method of claim 1, wherein each service level guarantee definition is implemented as a separate element in at least one Extended Markup Language (XML) document, the element for the service level guarantee includes the performance criteria defined in the service level agreement, and wherein the application service connection definition for each connection is implemented as an element in at least one XML document, wherein the attributes of the application service connection element provide information on the connection.
3. The method of claim 1, wherein multiple service level guarantee definitions indicating different performance criteria are associated with different sets of application service connection definitions.
4. The method of claim 3, wherein the application service definition for one connection may be associated with multiple service level guarantee definitions, wherein the monitoring comprises determining whether I/O requests transmitted through one connection satisfy the performance criteria of all associated service level guarantee definitions.
5. The method of claim 1, further comprising: providing an application service group identifying a plurality of application service connection definitions, wherein associating the service level guarantee definition with the application service connection definitions comprises associating each service level guarantee definition with at least one application service group, wherein the application service connection definitions identified in the application service group are associated with the service level guarantee definitions with which their application service group is associated.
6. The method of claim 5, further comprising: providing a service level commitment record associating one service level agreement definition with at least one application service group.
7. The method of claim 5, wherein at least one Extended Markup Language (XML) document includes one element for each application service group, and wherein the element for each application service group includes one sub-element for each application service connection included in that application service group, wherein each application service connection subelement includes attributes providing information on the application service connection.
8. The method of claim 1, wherein monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition comprises: gathering performance information concerning I/O requests for each connection; selecting one service level guarantee definition; and for each connection identified by one application service connection definition associated with the selected service level guarantee definition, comparing the gathered performance information for the connection with the performance criteria indicated in the selected service level guarantee definition.
9. The method of claim 8, further comprising: adjusting operations among the I/O paths represented by the application service connection definitions associated with the selected service level guarantee definition if the gathered performance information for the I/O paths does not satisfy the performance criteria.
10. The method of claim 9, wherein adjusting the operations comprises: determining I/O paths that are over performing and under performing with respect to the performance criteria; and throttling the transmission of I/O requests through I/O paths that are over performing.
11. The method of claim 10, wherein throttling the transmissions comprises delaying the processing of I/O requests transmitted through the over performing I/O paths.
12. The method of claim 8, wherein gathering performance information for I/O paths comprises determining an I/O response time and I/O demand at the I/O paths and comparing the determined I/O response time and the I/O demand with performance criteria for response time and demand in the selected service level guarantee definition.
13. The method of claim 12, wherein I/O demand comprises I/O operations per second per unit of contracted storage capacity and I/O throughput per contracted storage capacity.
14. The method of claim 13, wherein a connection is under performing if a percentage of I/O response times measured for the connection is less than a percentage guarantee indicated in the selected service level guarantee definition.
15. The method of claim 13, wherein a connection is under performing if the I/O demand exceeds the demand criteria indicated in the service level guarantee definition and a sampling of the measured I/O response times is less than the response time criteria indicated in the service level guarantee definition.
16. The method of claim 1, wherein the I/O paths are monitored by performance gateways monitoring I/O paths between hosts and storage volumes.
17. The method of claim 16, wherein the network comprises a Storage Area Network (SAN) and the performance gateways are implemented in a virtualization controller, and wherein the storage volumes comprise logical volumes in a virtualization layer implemented in the virtualization controller.
18. The method of claim 1, wherein the application service connections, service level guarantees, service level guarantee definitions, application service definitions information, and the monitoring of the I/O requests are provided by a server in a web service architecture that interfaces with a client to provide real time performance information on the I/O paths to the client.
19. A system for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems, comprising: a processing unit; code executed by the processing unit to cause operations to be performed, the operations comprising: (i) providing an application service connection definition for each connection from a host to a storage volume; (ii) providing at least one service level guarantee definition indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources; (iii) associating each service level guarantee definition with at least one application service connection definition; and (iv) monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.
20. The system of claim 19, further comprising: a storage medium including at least one Extended Markup Language (XML) document having: (i) a separate element for each service level guarantee definition, wherein the element for the service level guarantee includes the performance criteria defined in the service level agreement; and (ii) a separate element for each application service connection definition for each connection, wherein the attributes of the application service connection element provide information on the connection.
21. The system of claim 19, wherein multiple service level guarantee definitions indicating different performance criteria are associated with different sets of application service connection definitions.
22. The system of claim 21, wherein the application service definition for one connection may be associated with multiple service level guarantee definitions, wherein the monitoring comprises determining whether I/O requests transmitted through one connection satisfy the performance criteria of all associated service level guarantee definitions.
23. The system of claim 19, wherein the operations further comprise: providing an application service group identifying a plurality of application service connection definitions, wherein associating the service level guarantee definition with the application service connection definitions comprises associating each service level guarantee definition with at least one application service group, wherein the application service connection definitions identified in the application service group are associated with the service level guarantee definitions with which their application service group is associated.
24. The method of claim 23, further comprising: a storage medium having at least one Extended Markup Language (XML) document that includes one element for each application service group, and wherein the element for each application service group includes one sub-element for each application service connection included in that application service group, wherein each application service connection subelement includes attributes providing information on the application service connection.
25. The system of claim 19, wherein monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition comprises: gathering performance information concerning I/O requests for each connection; selecting one service level guarantee definition; and for each connection identified by one application service connection definition associated with the selected service level guarantee definition, comparing the gathered performance information for the connection with the performance criteria indicated in the selected service level guarantee definition.
26. The system of claim 25, wherein the operations further comprise: adjusting operations among the I/O paths represented by the application service connection definitions associated with the selected service level guarantee definition if the gathered performance information for the I/O paths does not satisfy the performance criteria.
27. The system of claim 26, wherein adjusting the operations comprises: determining I/O paths that are over performing and under performing with respect to the performance criteria; and throttling the transmission of I/O requests through I/O paths that are over performing.
28. The system of claim 25, wherein gathering performance information for I/O paths comprises determining an I/O response time and I/O demand at the I/O paths and comparing the determined I/O response time and the I/O demand with performance criteria for response time and demand in the selected service level guarantee definition.
29. The system of claim 19, further comprising: performance gateways to monitor the I/O paths between hosts and storage volumes.
30. The system of claim 19, wherein the processing unit comprises a server in a web service architecture that interfaces with a client to provide real time performance information on the I/O paths to the client.
31. An article of manufacture for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems, wherein the article of manufacture causes operations to be performed, the operations comprising: providing an application service connection definition for each connection from a host to a storage volume; providing at least one service level guarantee definition indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources; associating each service level guarantee definition with at least one application service connection definition; and monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.
32. The article of manufacture of claim 31, wherein each service level guarantee definition is implemented as a separate element in at least one Extended Markup Language (XML) document, the element for the service level guarantee includes the performance criteria defined in the service level agreement, and wherein the application service connection definition for each connection is implemented as an element in at least one XML document, wherein the attributes of the application service connection element provide information on the connection.
33. The article of manufacture of claim 31, wherein multiple service level guarantee definitions indicating different performance criteria are associated with different sets of application service connection definitions.
34. The article of manufacture of claim 33, wherein the application service definition for one connection may be associated with multiple service level guarantee definitions, wherein the monitoring comprises determining whether I/O requests transmitted through one connection satisfy the performance criteria of all associated service level guarantee definitions.
35. The article of manufacture of claim 31, wherein the operations further comprise: providing an application service group identifying a plurality of application service connection definitions, wherein associating the service level guarantee definition with the application service connection definitions comprises associating each service level guarantee definition with at least one application service group, wherein the application service connection definitions identified in the application service group are associated with the service level guarantee definitions with which their application service group is associated.
36. The article of manufacture of claim 35, wherein at least one Extended Markup Language (XML) document includes one element for each application service group, and wherein the element for each application service group includes one sub-element for each application service connection included in that application service group, wherein each application service connection subelement includes attributes providing information on the application service connection.
37. The article of manufacture of claim 36, wherein the operations further comprise: providing a service level commitment record associating one service level agreement definition with at least one application service group.
38. The article of manufacture of claim 31, wherein monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition comprises: gathering performance information concerning I/O requests for each connection; selecting one service level guarantee definition; and for each connection identified by one application service connection definition associated with the selected service level guarantee definition, comparing the gathered performance information for the connection with the performance criteria indicated in the selected service level guarantee definition.
39. The article of manufacture of claim 38, wherein the operations further comprise: adjusting operations among the I/O paths represented by the application service connection definitions associated with the selected service level guarantee definition if the gathered performance information for the I/O paths does not satisfy the performance criteria.
40. The article of manufacture of claim 39, wherein adjusting the operations comprises: determining I/O paths that are over performing and under performing with respect to the performance criteria; and throttling the transmission of I/O requests through I/O paths that are over performing.
41. The article of manufacture of claim 40, wherein throttling the transmissions comprises delaying the processing of I/O requests transmitted through the over performing I/O paths.
42. The article of manufacture of claim 38, wherein gathering performance information for I/O paths comprises determining an I/O response time and I/O demand at the I/O paths and comparing the determined I/O response time and the I/O demand with performance criteria for response time and demand in the selected service level guarantee definition.
43. The article of manufacture of claim 42, wherein I/O demand comprises I/O operations per second per unit of contracted storage capacity and I/O throughput per contracted storage capacity.
44. The article of manufacture of claim 43, wherein a connection is under performing if a percentage of I/O response times measured for the connection is less than a percentage guarantee indicated in the selected service level guarantee definition.
45. The article of manufacture of claim 43, wherein a connection is under performing if the I/O demand exceeds the demand criteria indicated in the service level guarantee definition and a sampling of the measured I/O response times is less than the response time criteria indicated in the service level guarantee definition.
46. The article of manufacture of claim 31, wherein the application service connections, service level guarantees, service level guarantee definitions, application service definitions information, and the monitoring of the I/O requests are provided by a server in a web service architecture that interfaces with a client to provide real time performance information on the I/O paths to the client.
47. A system for managing a network providing Input/Output (I/O) paths between a plurality of host systems and storage volumes in storage systems, comprising: means for providing an application service connection definition for each connection from a host to a storage volume; means for providing at least one service level guarantee definition indicating performance criteria to satisfy service requirements included in at least one service level agreement with at least one customer for network resources; means for associating each service level guarantee definition with at least one application service connection definition; and means for monitoring whether Input/Output (I/O) requests transmitted through the multiple I/O paths satisfy performance criteria indicated in the service level guarantee definition associated with the I/O paths.
48. The system of claim 47, wherein multiple service level guarantee definitions indicating different performance criteria are associated with different sets of application service connection definitions.
49. The system of claim 47, further comprising: means for providing an application service group identifying a plurality of application service connection definitions, wherein associating the service level guarantee definition with the application service connection definitions comprises associating each service level guarantee definition with at least one application service group, wherein the application service connection definitions identified in the application service group are associated with the service level guarantee definitions with which their application service group is associated.